pruning technique
The silence of the weights: an investigation of structural pruning strategies for attention-based audio signal architectures
Diecidue, Andrea, Barbano, Carlo Alberto, Fraternali, Piero, Fontaine, Mathieu, Tartaglione, Enzo
ABSTRACT Transformer-based models have become the state of the art across multiple domains, from natural language processing to machine listening, thanks to attention mechanisms. However, the attention layers require a large number of parameters and high-end hardware for both training and inference. We propose a novel pruning technique targeted explicitly at the attention mechanism, in which we decouple the pruning of the four layers in the attention block, namely the query, key, value, and output projection matrices. We also investigate strategies for pruning along the head and channel dimensions, and compare the performance of the Audio Spectrogram Transformer (AST) [1] model under different pruning scenarios. Our results show that even when pruning 50% of the attention parameters, we incur a performance degradation of less than 1%.
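The decoupled, per-projection pruning described above could be sketched as follows. This is a minimal illustration, not the paper's implementation: the head-slicing convention and the L1-norm importance score are assumptions.

```python
import numpy as np

def prune_projection_heads(W, num_heads, keep_ratio):
    """Zero out the lowest-magnitude heads of ONE attention projection.

    W: (d_model, d_model) weight matrix for a single projection
       (query, key, value, or output). Each projection is pruned
       independently of the other three, mirroring the decoupled
       strategy in the abstract. Scoring heads by L1 norm is an
       assumption made for this sketch.
    """
    d_model = W.shape[0]
    head_dim = d_model // num_heads
    # View the matrix as one slice of rows per head.
    heads = W.reshape(num_heads, head_dim, d_model)
    # Importance score: total absolute weight mass of each head.
    scores = np.abs(heads).sum(axis=(1, 2))
    keep = max(1, int(round(num_heads * keep_ratio)))
    kept = np.argsort(scores)[-keep:]
    mask = np.zeros(num_heads, dtype=bool)
    mask[kept] = True
    # Zero out the pruned heads; a real implementation would also
    # shrink the matrix to realize the compute savings.
    pruned = heads * mask[:, None, None]
    return pruned.reshape(d_model, d_model), mask
```

Because each of the four projections gets its own call, the query matrix can end up with a different set of surviving heads than the value matrix, which is the point of decoupling them.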
Scalable Interconnect Learning in Boolean Networks
Kresse, Fabian, Yu, Emily, Lampert, Christoph H.
Learned Differentiable Boolean Logic Networks (DBNs) already deliver efficient inference on resource-constrained hardware. We extend them with a trainable, differentiable interconnect whose parameter count remains constant as input width grows, allowing DBNs to scale to far wider layers than earlier learnable-interconnect designs while preserving their advantageous accuracy. To further reduce model size, we propose two complementary pruning stages: an SAT-based logic equivalence pass that removes redundant gates without affecting performance, and a similarity-based, data-driven pass that outperforms a magnitude-style greedy baseline and offers a superior compression-accuracy trade-off.
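The similarity-based, data-driven pruning pass might look roughly like this. It is a hypothetical sketch: the agreement metric, the threshold, and the calibration setup are assumptions, not the paper's method.

```python
import numpy as np

def similarity_prune(gate_outputs, threshold=0.95):
    """Data-driven pass: merge each gate into an earlier gate whose
    sampled outputs agree on at least `threshold` of the inputs.

    gate_outputs: (num_gates, num_samples) boolean array of each
    gate's activations on a calibration batch (hypothetical setup).
    Returns a dict {pruned_gate: surviving_gate}; consumers of a
    pruned gate would be rewired to its surviving near-duplicate.
    """
    n, _ = gate_outputs.shape
    merged = {}
    for i in range(n):
        if i in merged:
            continue
        for j in range(i + 1, n):
            if j in merged:
                continue
            agreement = (gate_outputs[i] == gate_outputs[j]).mean()
            if agreement >= threshold:
                merged[j] = i  # reroute consumers of gate j to gate i
    return merged
```

Unlike the SAT-based pass, which only removes gates that are provably equivalent, a pass like this trades a bounded amount of disagreement on the calibration data for extra compression.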
Energy-Aware LLMs: A step towards sustainable AI for downstream applications
Tran, Nguyen Phuc, Jaumard, Brigitte, Delgado, Oscar
Advanced Large Language Models (LLMs) have revolutionized various fields, including communication networks, sparking an innovation wave that has led to new applications and services and significantly enhanced solution schemes. Despite these impressive developments, most LLMs require huge computational resources, resulting in extremely high energy consumption. This study therefore proposes an end-to-end pipeline that investigates the trade-off between energy efficiency and model performance for an LLM applied to fault ticket analysis in communication networks. It evaluates the pipeline using two real-world datasets for the tasks of root cause analysis and response feedback in a communication network. Our results show that an appropriate combination of quantization and pruning techniques is able to reduce energy consumption while significantly improving model performance.
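The kind of quantization-plus-pruning combination such a pipeline evaluates can be illustrated with a toy example: magnitude pruning followed by symmetric uniform quantization. All details here are assumptions for illustration, not the paper's actual pipeline.

```python
import numpy as np

def compress_weights(W, prune_ratio=0.5, num_bits=8):
    """Toy compression step: magnitude pruning, then symmetric
    uniform quantization of the surviving weights.

    The specific combination (unstructured magnitude pruning plus
    per-tensor symmetric quantization) is an assumption made for
    this sketch, not the method evaluated in the paper.
    """
    flat = np.abs(W).ravel()
    k = int(flat.size * prune_ratio)
    # Threshold at the k-th smallest magnitude; everything below is zeroed.
    thresh = np.partition(flat, k)[k] if k > 0 else 0.0
    Wp = np.where(np.abs(W) >= thresh, W, 0.0)
    # Symmetric uniform quantization to num_bits.
    max_abs = np.abs(Wp).max()
    scale = max_abs / (2 ** (num_bits - 1) - 1) if max_abs > 0 else 1.0
    q = np.round(Wp / scale).astype(np.int8)
    return q, scale  # dequantize with q * scale
```

The energy question the abstract raises is then an empirical one: whether the smaller integer representation reduces measured consumption enough to justify any accuracy cost on the downstream task.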
EvoP: Robust LLM Inference via Evolutionary Pruning
Wu, Shangyu, Du, Hongchao, Xiong, Ying, Chen, Shuai, Kuo, Tei-wei, Guan, Nan, Xue, Chun Jason
Large Language Models (LLMs) have achieved remarkable success in natural language processing tasks, but their massive size and computational demands hinder their deployment in resource-constrained environments. Existing structured pruning methods address this issue by removing redundant structures (e.g., elements, channels, layers) from the model. However, these methods employ heuristic pruning strategies, which leads to suboptimal performance. Moreover, they ignore the characteristics of the data when pruning the model. To overcome these limitations, we propose EvoP, an evolutionary pruning framework for robust LLM inference. EvoP first presents a cluster-based calibration dataset sampling (CCDS) strategy for creating a more diverse calibration dataset. EvoP then introduces an evolutionary pruning pattern searching (EPPS) method to find the optimal pruning pattern. Compared to existing structured pruning techniques, EvoP achieves the best performance while maintaining the best efficiency. Experiments across different LLMs and different downstream tasks validate the effectiveness of the proposed EvoP, making it a practical and scalable solution for deploying LLMs in real-world applications.
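An evolutionary search over pruning patterns can be sketched as a simple genetic algorithm over binary keep/remove masks. This is a toy illustration only; EvoP's actual fitness function, genetic operators, and CCDS calibration sampling are not reproduced here.

```python
import random

def evolve_pruning_pattern(fitness, num_units, sparsity,
                           pop=20, gens=30, seed=0):
    """Toy evolutionary search for a binary keep(1)/remove(0) pattern.

    fitness: callable(mask) -> loss on a calibration set (lower is
        better); a stand-in for whatever model-quality measure an
        evolutionary pruner would evaluate.
    num_units: number of prunable structures (e.g. layers).
    sparsity: fraction of units to remove in every candidate.
    """
    rng = random.Random(seed)
    k = int(num_units * sparsity)

    def random_mask():
        removed = rng.sample(range(num_units), k)
        return tuple(0 if i in removed else 1 for i in range(num_units))

    def mutate(mask):
        # Swap one kept unit with one removed unit, preserving sparsity.
        m = list(mask)
        kept = [i for i, b in enumerate(m) if b]
        removed = [i for i, b in enumerate(m) if not b]
        a, b = rng.choice(kept), rng.choice(removed)
        m[a], m[b] = 0, 1
        return tuple(m)

    population = [random_mask() for _ in range(pop)]
    for _ in range(gens):
        population.sort(key=fitness)          # elitist selection
        survivors = population[: pop // 2]
        population = survivors + [mutate(rng.choice(survivors))
                                  for _ in range(pop - len(survivors))]
    return min(population, key=fitness)
```

The contrast with heuristic strategies is that the search evaluates whole pruning patterns on calibration data, rather than scoring each structure in isolation.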
Review for NeurIPS paper: HYDRA: Pruning Adversarially Robust Neural Networks
Weaknesses: - It is not clear that HYDRA improves robustness to adversarial attacks. Test accuracy (benign) appears to correlate with adversarial accuracy (see Table 1). The authors also observe this indirectly at L217: "Our results confirm that the compressed networks show similar trends as non-compressed nets with these attacks." It seems that as long as models are compressed properly, the resulting models are about as robust as dense networks. It is therefore important to evaluate some SOTA sparse networks and compare them with HYDRA.